Status of this Memo


This document specifies an Internet standards track protocol for the
Internet community, and requests discussion and suggestions for
improvements. Please refer to the current edition of the "Internet
Official Protocol Standards" (STD 1) for the standardization state
and status of this protocol. Distribution of this memo is unlimited.


Copyright Notice 
Copyright (C) The Internet Society (2001). All Rights Reserved. 

Abstract 
International Telecommunication Union (ITU-T) Recommendation G.722.1 
is a wide-band audio codec, which operates at one of two selectable 
bit rates, 24 kbit/s or 32 kbit/s. This document describes the payload
format for including G.722.1 generated bit streams within an RTP 
packet. Also included here are the necessary details for the use of 
G.722.1 with MIME and SDP. 

1. Conventions used in this document 
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC-2119 [6].


2. Overview of ITU-T Recommendation G.722.1 


G.722.1 is a low complexity coder that compresses 50 Hz - 7 kHz audio
signals into one of two bit rates, 24 kbit/s or 32 kbit/s.


The coder may be used for speech, music and other types of audio. 
Some of the applications for which this coder is suitable are: 
o Real-time communications such as videoconferencing and telephony
o Streaming audio
o Archival and messaging


A fixed frame size of 20 ms is used, and for any given bit rate the
number of bits in a frame is a constant.


3. RTP payload format for G.722.1 


G.722.1 uses 20 ms frames and a sampling rate clock of 16 kHz, so the 
RTP timestamp MUST be in units of 1/16000 of a second. The RTP 
payload for G.722.1 has the format shown in Figure 1. No additional 
header specific to this payload format is required. 


+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        RTP Header [3]                         |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|                                                               |
+                 one or more frames of G.722.1                 |
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


Figure 1: RTP payload for G.722.1 
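

As an illustration of the timestamp rule above, the following Python
sketch computes how far the RTP timestamp advances per frame and per
packet. The names used (TICKS_PER_FRAME, next_timestamp) are
hypothetical and not defined by this memo, and the sketch assumes
continuous transmission with no gaps between packets.

```python
# Illustrative sketch only; these names are not defined by this memo.
CLOCK_RATE_HZ = 16000     # RTP timestamp clock for G.722.1
FRAME_DURATION_MS = 20    # fixed G.722.1 frame duration

# Timestamp units covered by one frame: 16000 * 20 / 1000 = 320.
TICKS_PER_FRAME = CLOCK_RATE_HZ * FRAME_DURATION_MS // 1000


def next_timestamp(timestamp: int, frames_in_packet: int) -> int:
    """Return the timestamp of the packet that follows one carrying
    `frames_in_packet` frames. RTP timestamps are 32 bits wide, so
    the result wraps modulo 2**32."""
    return (timestamp + frames_in_packet * TICKS_PER_FRAME) % (1 << 32)
```

With one frame per packet the timestamp therefore advances by 320
units per packet, and with two frames per packet by 640 units.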


The encoding and decoding algorithm can change the bit rate at any 
20 ms frame boundary, but no bit rate change notification is provided
in-band with the bit stream. Therefore, a separate out-of-band
method is REQUIRED to indicate the bit rate (see section 5 for an
example of signaling bit rate information using SDP). For the
payload format specified here, the bit rate MUST remain constant for 
a particular payload type value. An application MAY switch bit rates 
from packet to packet by defining two payload type values and 
switching between them. 


The assignment of an RTP payload type for this new packet format is 
outside the scope of this document, and will not be specified here. 
It is expected that the RTP profile for a particular class of 
applications will assign a payload type for this encoding, or if that 
is not done then a payload type in the dynamic range shall be chosen. 


The number of bits within a frame is fixed, and within this fixed 
frame G.722.1 uses variable length coding (e.g., Huffman coding) to 
represent most of the encoded parameters [2]. All variable length 
codes are transmitted in order from the leftmost (most significant,
MSB) bit to the rightmost (least significant, LSB) bit; see [2] for
more details.


The use of Huffman coding means that it is not possible to identify
the various encoded parameters/fields contained within the bit stream
without first completely decoding the entire frame.


For the purposes of packetizing the bit stream in RTP, it is only
necessary to consider the sequence of bits as output by the G.722.1 
encoder, and present the same sequence to the decoder. The payload 
format described here maintains this sequence. 


When operating at 24 kbit/s, 480 bits (60 octets) are produced per 
frame, and when operating at 32 kbit/s, 640 bits (80 octets) are 
produced per frame. Thus, both bit rates allow for octet alignment 
without the need for padding bits. 


Figure 2 illustrates how the G.722.1 bit stream MUST be mapped into 
an octet aligned RTP payload. 


An RTP packet SHALL only contain G.722.1 frames of the same bit rate. 


 first bit                                      last bit
transmitted                                   transmitted
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                       |
+ sequence of bits (480 or 640) generated by the        |
| G.722.1 encoder for transmission                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ... +-+-+-+-+-+-+-+-+
|MSB...      LSB|MSB...      LSB|     |MSB...      LSB|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ... +-+-+-+-+-+-+-+-+
      RTP             RTP                   RTP
    octet 1         octet 2                octet
                                          60 or 80


Figure 2: The G.722.1 encoder bit stream is split into 
a sequence of octets (60 or 80 depending on 
the bit rate), and each octet is in turn 
mapped into an RTP octet. 
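

The mapping in Figure 2 can be sketched in Python as follows. This is
a minimal illustration, assuming the encoder output is available as a
sequence of 0/1 integers in transmission order; the function name
pack_frame_bits is hypothetical and not part of this specification.

```python
def pack_frame_bits(bits):
    """Pack encoder output bits into octets.

    The first bit transmitted becomes the MSB of octet 1, the ninth
    bit becomes the MSB of octet 2, and so on, so the bit sequence
    is preserved exactly.
    """
    if len(bits) % 8 != 0:
        raise ValueError("frame length is not octet aligned")
    octets = bytearray()
    for start in range(0, len(bits), 8):
        value = 0
        for bit in bits[start:start + 8]:
            value = (value << 1) | (bit & 1)  # earlier bits end up higher
        octets.append(value)
    return bytes(octets)
```

A 480-bit frame packed this way yields 60 octets and a 640-bit frame
yields 80 octets, matching the sizes given above.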


The ITU-T standardized bit rates for G.722.1 are 24 kbit/s and 
32 kbit/s. However, the coding algorithm itself has the capability to
run at any user specified bit rate (not just 24 and 32 kbit/s) while
maintaining an audio bandwidth of 50 Hz to 7 kHz. This rate change 
is accomplished by a linear scaling of the codec operation, resulting 
in frames with size in bits equal to 1/50 of the corresponding bit 
rate. 


When operating at non-standard rates, the payload format MUST follow
the guidelines illustrated in Figure 2. It is RECOMMENDED that
values in the range 16000 to 32000 be used, and any value MUST be a
multiple of 400 (this maintains octet alignment, so that no
(undefined) padding bits are required for each frame). For example,
a bit rate of 16.4 kbit/s will result in a frame of size 328 bits or
41 octets, which are mapped into RTP per Figure 2.
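

The arithmetic behind the multiple-of-400 rule can be sketched as
follows: a 20 ms frame holds 1/50 of the bit rate in bits, and
dividing by 8 gives octets, so the frame size in octets is the bit
rate divided by 400. The function names below are hypothetical and
serve only as an illustration.

```python
def octets_per_frame(bitrate: int) -> int:
    """Return the size in octets of one 20 ms frame at `bitrate` bit/s.

    bits per frame   = bitrate / 50
    octets per frame = bits per frame / 8 = bitrate / 400
    """
    if bitrate <= 0 or bitrate % 400 != 0:
        raise ValueError("bit rate must be a positive multiple of 400")
    return bitrate // 400


def in_recommended_range(bitrate: int) -> bool:
    """True if the rate lies in the RECOMMENDED 16000 to 32000 range."""
    return 16000 <= bitrate <= 32000
```

For the rates mentioned in the text, this gives 60 octets at 24000,
80 octets at 32000, and 41 octets at 16400.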


3.1 Multiple G.722.1 frames in an RTP packet


More than one G.722.1 frame may be included in a single RTP packet by 
a sender. 


Senders have the following additional restrictions: 


o SHOULD NOT include more G.722.1 frames in a single RTP packet than 
will fit in the MTU of the RTP transport protocol. 


o All frames contained in a single RTP packet MUST be of the same 
length, that is they MUST have the same bit rate (octets per 
frame). 


o Frames MUST NOT be split between RTP packets. 


It is RECOMMENDED that the number of frames contained within an RTP 
packet be consistent with the application. For example, in a
telephony application where delay is important, the fewer frames per
packet the lower the delay, whereas for a delay insensitive streaming
or messaging application, many frames per packet would be
acceptable.
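

The sender restrictions above can be sketched as a simple grouping
routine. This is an illustration under stated assumptions, not a
normative procedure: group_frames and max_payload_octets are
hypothetical names, and max_payload_octets is assumed to be the octet
budget that remains after the sender has subtracted the RTP header
and lower-layer headers from the MTU.

```python
def group_frames(frames, max_payload_octets):
    """Group whole G.722.1 frames into RTP payloads.

    All frames must have the same length (same bit rate), frames are
    never split across payloads, and no payload exceeds the budget.
    """
    if not frames:
        return []
    frame_len = len(frames[0])
    if any(len(frame) != frame_len for frame in frames):
        raise ValueError("all frames must have the same length")
    per_packet = max_payload_octets // frame_len
    if per_packet < 1:
        raise ValueError("budget is too small for a single frame")
    return [
        b"".join(frames[start:start + per_packet])
        for start in range(0, len(frames), per_packet)
    ]
```

A delay-sensitive application would call this with a small budget
(or simply send one frame per packet), while a streaming application
could use the largest budget the path allows.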


3.2 Computing the number of G.722.1 frames 


Information describing the number of frames contained in an RTP 
packet is not transmitted as part of the RTP payload. The only way 
to determine the number of G.722.1 frames is to count the total 
number of octets within the RTP packet, and divide the octet count by 
the number of expected octets per frame (either 60 or 80 per frame, 
for 24 kbit/s and 32 kbit/s respectively). 
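

A receiver-side sketch of this computation follows. It assumes the
payload has already been separated from the RTP header and any RTP
padding, and that the bit rate is known from out-of-band signaling.
The name count_frames is hypothetical, and rejecting payloads that
are not a whole number of frames is a choice made for this sketch
rather than a behavior specified by this memo.

```python
def count_frames(payload: bytes, bitrate: int) -> int:
    """Return the number of G.722.1 frames in an RTP payload."""
    if bitrate <= 0 or bitrate % 400 != 0:
        raise ValueError("bit rate must be a positive multiple of 400")
    frame_len = bitrate // 400  # 60 octets at 24000, 80 at 32000
    count, remainder = divmod(len(payload), frame_len)
    if count == 0 or remainder != 0:
        raise ValueError("payload is not a whole number of frames")
    return count
```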


4. MIME registration of G.722.1


MIME media type name: audio

MIME subtype: G7221

Required parameters:


bitrate: the data rate for the audio bit stream. This 
parameter is necessary because the bit rate is not signaled 
within the G.722.1 bit stream. At the standard G.722.1 bit 
rates, the value MUST be either 24000 or 32000. If using the 
non-standard bit rates, then it is RECOMMENDED that values in 
the range 16000 to 32000 be used, and that any value MUST be a 
multiple of 400 (this maintains octet alignment and does not 
then require (undefined) padding bits for each frame if not 
octet aligned). 


Optional parameters:

ptime: RECOMMENDED duration of each packet in milliseconds.

Encoding considerations:

This type is only defined for transfer via RTP as specified in
a Work in Progress.


Security Considerations: 
See Section 6 of RFC 3047. 


Interoperability considerations: none 

Published specification: 
See ITU-T Recommendation G.722.1 [2] for encoding algorithm 
details. 


Applications which use this media type: 
Audio and video streaming and conferencing tools 


Additional information: none 

Person & email address to contact for further information: 
Patrick Luthi 
Luthip@pictel.com 

Intended usage: COMMON 

Author/Change controller:
Author: Patrick Luthi
Change controller: IETF AVT Working Group


5. SDP usage of G.722.1


When conveying information by SDP [5], the encoding name SHALL be 
"G7221" (the same as the MIME subtype). An example of the media 
representation in SDP for describing G.722.1 at 24000 bits/sec might 
be: 


m=audio 49000 RTP/AVP 121 
a=rtpmap:121 G7221/16000 
a=fmtp:121 bitrate=24000 


where "bitrate" is a variable that may take on values of 24000 or 
32000 at the standard rates, or values from 16000 to 32000 (and MUST 
be an integer multiple of 400) at the non-standard rates. 
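

As an illustration, a receiver might recover the bit rate for each
payload type from a session description along the following lines.
This is a simplified sketch, not a complete SDP parser: the function
name bitrate_by_payload_type is hypothetical, and the code assumes
fmtp parameters are separated by semicolons and ignores media-level
scoping.

```python
def bitrate_by_payload_type(sdp: str) -> dict:
    """Map each G7221 payload type in `sdp` to its signaled bit rate."""
    encodings = {}  # payload type -> encoding name from a=rtpmap
    bitrates = {}   # payload type -> bitrate value from a=fmtp
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("a=rtpmap:"):
            pt, _, encoding = line[len("a=rtpmap:"):].partition(" ")
            encodings[pt] = encoding.split("/")[0]
        elif line.startswith("a=fmtp:"):
            pt, _, params = line[len("a=fmtp:"):].partition(" ")
            for param in params.split(";"):
                name, _, value = param.strip().partition("=")
                if name == "bitrate":
                    bitrates[pt] = int(value)
    return {
        int(pt): rate
        for pt, rate in bitrates.items()
        if encodings.get(pt, "").upper() == "G7221"
    }
```

Applied to the example above, this yields a mapping from payload type
121 to 24000. A description that defines two payload types with
different "bitrate" values gives the receiver what it needs to follow
a sender that switches rates by switching payload types.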


6. Security Considerations 


RTP packets using the payload format defined in this specification 
are subject to the security considerations discussed in the RTP 
specification [3], and any appropriate RTP profile. This implies 
that confidentiality of the media streams is achieved by encryption. 
Because the data compression used with this payload format is applied 
end-to-end, encryption may be performed after compression so there is 
no conflict between the two operations. 


A potential denial-of-service threat exists for data encodings using 
compression techniques that have non-uniform receiver-end 
computational load. The attacker can inject pathological datagrams 
into the stream which are complex to decode and cause the receiver to 
be overloaded. However, this encoding does not exhibit any 
significant non-uniformity. 


As with any IP-based protocol, in some circumstances a receiver may 
be overloaded simply by the receipt of too many packets, either 
desired or undesired. Network-layer authentication may be used to 
discard packets from undesired sources, but the processing cost of 
the authentication itself may be too high. In a multicast 
environment, pruning of specific sources may be implemented in future 
versions of IGMP [7] and in multicast routing protocols to allow a 
receiver to select which sources are allowed to reach it. 


10. Full Copyright Statement
Copyright (C) The Internet Society (2001). All Rights Reserved. 


This document and translations of it may be copied and furnished to 
others, and derivative works that comment on or otherwise explain it 
or assist in its implementation may be prepared, copied, published 
and distributed, in whole or in part, without restriction of any 
kind, provided that the above copyright notice and this paragraph are 
included on all such copies and derivative works. However, this 
document itself may not be modified in any way, such as by removing 
the copyright notice or references to the Internet Society or other 
Internet organizations, except as needed for the purpose of 
developing Internet standards in which case the procedures for 
copyrights defined in the Internet Standards process must be 
followed, or as required to translate it into languages other than 
English. 


The limited permissions granted above are perpetual and will not be 
revoked by the Internet Society or its successors or assigns. 


This document and the information contained herein is provided on an 
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING 
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING 
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION 
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF 
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 